IJTCS | Track Program: Multi-Agent Reinforcement Learning
Editor's Note
The first International Joint Conference on Theoretical Computer Science (IJTCS) will be held online from August 17 to 21, 2020. The conference is jointly hosted by Peking University, the China Society for Industrial and Applied Mathematics (CSIAM), the China Computer Federation (CCF), and the ACM China Council, and is organized by the Center on Frontiers of Computing Studies (CFCS) at Peking University.
The theme of the conference is "Recent Advances and Focus Topics in Theoretical Computer Science." The conference features seven tracks, offering in-depth discussions of algorithmic game theory, blockchain technology, multi-agent reinforcement learning, machine learning theory, quantum computing, machine learning and formal methods, and algorithms and complexity. In addition, the conference hosts a Young PhD Forum, a Female Scholar Forum, and an Undergraduate Research Forum, bringing together renowned experts and scholars from home and abroad to focus on frontier problems in theoretical computer science. Information will be updated continuously; stay tuned!
This issue presents the "Multi-Agent Reinforcement Learning" track.
About the "Multi-Agent Reinforcement Learning" Track
Multi-agent reinforcement learning (MARL) is an emerging research area that combines game theory with deep reinforcement learning to tackle collective decision-making problems over complex state and action spaces. It has broad application prospects in game AI, industrial robotics, social prediction, and beyond. Chinese researchers have made substantial progress on problems such as convergence theory for multi-agent algorithms, learning algorithms for multi-agent communication, and large-scale multi-agent systems, and are advancing MARL research together with researchers worldwide. The IJTCS MARL track will focus on frontier topics including multi-agent communication algorithms, world-model-based reinforcement learning, multi-agent policy evaluation, and solution concepts for MARL, and we look forward to discussing the future directions of MARL with the broader research community.
"Multi-Agent Reinforcement Learning" Track Chairs
Wenxin Li
Peking University
Haifeng Zhang
Institute of Automation, Chinese Academy of Sciences
"Multi-Agent Reinforcement Learning" Track Agenda
Date: August 18, 2020
"Multi-Agent Reinforcement Learning" Track: Talk Abstracts
Guochuan Zhang
Online Search and Pursuit-Evasion in Robotics
Abstract
In search and pursuit-evasion problems, one team of mobile entities is asked to find a set of fixed objects, or to capture another team of moving objects, in an environment. The searching strategy or motion planning plays a key role in either scenario. In this talk, we briefly introduce several exploration and search models in unknown environments and propose a number of challenging algorithmic problems.
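As a small illustration of the flavor of online search (not taken from the talk itself): in the classical linear-search, or "cow-path", problem a searcher starts at the origin of a line and must reach a target whose distance and direction are unknown; the well-known doubling strategy is 9-competitive, i.e., it travels at most nine times the target's distance for targets at distance at least 1. The Python sketch below simulates that strategy; all names are illustrative.

```python
def doubling_search(target):
    """Classical doubling (cow-path) strategy: starting from the origin of a
    line, sweep distances 1, 2, 4, ... while alternating direction, returning
    to the origin after each unsuccessful sweep.  Returns the total distance
    travelled before reaching `target` (a nonzero float whose sign encodes
    the unknown direction)."""
    travelled, step, direction = 0.0, 1.0, +1
    while True:
        same_side = (target > 0) == (direction > 0)
        if same_side and abs(target) <= step:
            return travelled + abs(target)   # target lies within this sweep
        travelled += 2 * step                # sweep out and walk back
        step, direction = 2 * step, -direction


if __name__ == "__main__":
    # The competitive ratio (distance travelled / |target|) stays below 9 for
    # any target at distance >= 1; targets just past a turning point, such as
    # 4.01 or -8.02 here, come closest to the bound.
    for d in (1.5, -3.0, 4.01, -8.02, 1000.0):
        print(d, round(doubling_search(d) / abs(d), 2))
```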
Dongge Wang
A Distance Function to Nash Equilibrium
Abstract
Nash equilibrium has long been a desired solution concept in economics and game-theoretic studies. Although the complexity literature has closed the door on efficiently computing exact equilibria, approximation methods are still sought after in various application fields such as online marketing, crowdsourcing, and the sharing economy. In this work, we present a new approach to obtaining an approximate Nash equilibrium in any N-player normal-form zero-sum game with discrete action spaces, which is applicable to any general N-player game after some pre-processing. Our approach defines a new measure of the distance between the players' current joint strategy profile and that of a Nash equilibrium. This transforms the task of finding an equilibrium into one of global minimization, which we solve with a gradient descent algorithm, and we further prove the convergence of our algorithm under moderate assumptions. We then compare our algorithm with baselines in experiments, showing consistent and significant improvements in approximate Nash equilibrium computation as well as the robustness of the algorithm as the game size increases.
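The abstract does not spell out its distance function, so the sketch below is only a generic illustration of the "measure the distance, then minimize it" idea, under my own assumptions: for a two-player zero-sum matrix game, one standard choice is the total best-response gain (often called exploitability), which is zero exactly at a Nash equilibrium and can be driven toward zero by projected (sub)gradient descent. All function names and parameters here are illustrative, not the paper's.

```python
import numpy as np


def exploitability(A, x, y):
    """Distance-style measure for a two-player zero-sum matrix game.

    The row player maximizes x @ A @ y, the column player minimizes it.  The
    measure equals the sum of both players' best-response gains; it is zero
    exactly at a Nash equilibrium and positive everywhere else.
    """
    return (A @ y).max() - (A.T @ x).min()


def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)


def descend(A, steps=20000, lr0=0.5, seed=0):
    """Projected subgradient descent on the measure; returns the best iterate."""
    rng = np.random.default_rng(seed)
    x = project_simplex(rng.random(A.shape[0]))
    y = project_simplex(rng.random(A.shape[1]))
    best = (exploitability(A, x, y), x, y)
    for t in range(steps):
        i_star = int(np.argmax(A @ y))      # row player's best response to y
        j_star = int(np.argmin(A.T @ x))    # column player's best response to x
        gx = -A[:, j_star]                  # subgradient of the measure w.r.t. x
        gy = A[i_star, :]                   # subgradient of the measure w.r.t. y
        lr = lr0 / np.sqrt(t + 1)
        x = project_simplex(x - lr * gx)
        y = project_simplex(y - lr * gy)
        d = exploitability(A, x, y)
        if d < best[0]:
            best = (d, x, y)
    return best


if __name__ == "__main__":
    # Rock-paper-scissors: the unique Nash equilibrium is uniform play.
    A = np.array([[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])
    d, x, y = descend(A)
    print("distance:", round(d, 4), "x:", np.round(x, 2), "y:", np.round(y, 2))
```

Because this zero-sum measure is jointly convex in (x, y), driving it to zero recovers an (approximate) equilibrium; the paper's measure for general N-player games is presumably more involved.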
Weinan Zhang
Model-based Multi-Agent Reinforcement Learning
Abstract
Multi-agent reinforcement learning (MARL) typically suffers from low sample efficiency due to ineffective multi-agent exploration in the state and joint action space. In single-agent RL, there has been increasing interest in building environment dynamics models and performing model-based RL to improve sample efficiency. In this talk, I will present an attempt to build model-based methods that achieve sample-efficient MARL. First, I will discuss several important settings of model-based MARL tasks and their key challenges. Then I will delve into the decentralized model-based MARL setting, which can be applied on top of almost all decentralized model-free MARL methods. A theoretical bound on the policy value discrepancy will be derived, based on which an efficient decentralized model-based MARL algorithm will be introduced. I will then show preliminary experimental results. The final takeaway of this talk will be a discussion of the feasibility and challenges of model-based MARL.
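The talk's decentralized model-based algorithm and its theoretical bound are not given in the abstract, so the sketch below only illustrates the general Dyna-style pattern the abstract alludes to: each agent fits its own (here, trivially tabular) dynamics model from real transitions and then performs extra updates on transitions imagined from that model, on top of an ordinary decentralized model-free learner (here, independent Q-learning). The toy environment and all names are assumptions of mine, not the speaker's method.

```python
import random
from collections import defaultdict

N_CELLS, GOAL, ACTIONS = 4, 3, (-1, 0, +1)


class ToyEnv:
    """Two agents on a short line; reward 1 only when both stand on the goal cell."""

    def reset(self):
        self.pos = [0, 0]
        return tuple(self.pos)                      # agent i observes only pos[i]

    def step(self, actions):
        for i, a in enumerate(actions):
            self.pos[i] = min(max(self.pos[i] + a, 0), N_CELLS - 1)
        done = all(p == GOAL for p in self.pos)
        return tuple(self.pos), (1.0 if done else 0.0), done


class DynaAgent:
    """Decentralized Dyna-style agent: a local Q-table plus a local dynamics model."""

    def __init__(self, eps=0.1, alpha=0.5, gamma=0.95):
        self.q = defaultdict(float)                 # (obs, action) -> value
        self.model = {}                             # (obs, action) -> (next_obs, reward)
        self.eps, self.alpha, self.gamma = eps, alpha, gamma

    def act(self, obs):
        if random.random() < self.eps:
            return random.choice(ACTIONS)
        best = max(self.q[(obs, a)] for a in ACTIONS)
        return random.choice([a for a in ACTIONS if self.q[(obs, a)] == best])

    def _update(self, obs, a, r, obs2):
        target = r + self.gamma * max(self.q[(obs2, b)] for b in ACTIONS)
        self.q[(obs, a)] += self.alpha * (target - self.q[(obs, a)])

    def learn(self, obs, a, r, obs2, imagined=10):
        self._update(obs, a, r, obs2)               # learn from the real transition
        self.model[(obs, a)] = (obs2, r)            # fit the (tabular) local model
        for _ in range(imagined):                   # extra updates on imagined data
            (o, b), (o2, rr) = random.choice(list(self.model.items()))
            self._update(o, b, rr, o2)


def train(episodes=500, horizon=30):
    env, agents = ToyEnv(), [DynaAgent(), DynaAgent()]
    for _ in range(episodes):
        obs, done, t = env.reset(), False, 0
        while not done and t < horizon:
            acts = [ag.act(o) for ag, o in zip(agents, obs)]
            obs2, r, done = env.step(acts)
            for ag, o, a, o2 in zip(agents, obs, acts, obs2):
                ag.learn(o, a, r, o2)
            obs, t = obs2, t + 1
    return agents


if __name__ == "__main__":
    agents = train()
    print("greedy action at the start cell:",
          [max(ACTIONS, key=lambda a: ag.q[(0, a)]) for ag in agents])
```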
Haifeng Zhang
Solution Concepts in Multi-agent Reinforcement Learning
Abstract
Nash equilibrium has long been a well-studied solution concept in game theory, and multi-agent reinforcement learning algorithms naturally tend to set Nash equilibrium as the learning objective. In many situations, however, other solution concepts such as Stackelberg equilibrium and correlated equilibrium have the potential to perform better than Nash equilibrium. In this talk, we will present two MARL algorithms, bi-level actor-critic (Bi-AC) and signal instructed coordination (SIC), which aim to solve for Stackelberg and correlated equilibria, respectively.
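For readers less familiar with these solution concepts, the standard textbook definitions for a two-player normal-form game (utilities u_1, u_2, mixed strategies x, y, or a joint distribution π over action profiles) are sketched below; these formulas are illustrative background, not taken from the talk.

```latex
% Nash equilibrium: no player gains by deviating unilaterally.
u_1(x^*, y^*) \ge u_1(x, y^*) \quad \forall x, \qquad
u_2(x^*, y^*) \ge u_2(x^*, y) \quad \forall y.

% Stackelberg equilibrium: player 1 (the leader) commits first,
% player 2 (the follower) best-responds to the commitment.
x^* \in \arg\max_{x} u_1\bigl(x, \mathrm{BR}_2(x)\bigr),
\qquad \mathrm{BR}_2(x) \in \arg\max_{y} u_2(x, y).

% Correlated equilibrium: a joint distribution \pi over action profiles such
% that obeying the sampled recommendation a_i is optimal for every player i.
\sum_{a_{-i}} \pi(a_i, a_{-i})
  \bigl[u_i(a_i, a_{-i}) - u_i(a_i', a_{-i})\bigr] \ge 0
\quad \forall i,\ \forall a_i,\ \forall a_i'.
```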
Zongqing Lu
Learning Multi-Agent Cooperation
Abstract
Cooperation is a widespread phenomenon in nature, from viruses, bacteria, and social amoebae to insect societies, social animals, and humans. Enabling agents to learn to cooperate in multi-agent environments is also crucially important for many applications, such as autonomous driving, multi-robot control, traffic light control, smart grid control, and network optimization. In this talk, I will focus on the latest reinforcement learning methods for multi-agent cooperation via joint policy learning, communication, agent modeling, and related techniques.
Wenxin Li
An Overview of Game-Based AI Competitions: From the Perspective of AI Evaluation
Abstract
Intelligence exists when we measure it! A game-based AI competition explicitly depicts our imagination of intelligence, and holding such competitions has recently become quite popular at AI conferences such as AAAI and IJCAI. With their clear and precise problem definitions, unified platform environments, fair performance assessment mechanisms, open data sets, and benchmarks, game-based AI competitions have attracted many researchers, thereby accelerating the development of artificial intelligence technology.
A new trend in game-based competitions is to host a competition over a long period on an online platform, which encourages researchers and AI enthusiasts to work continuously on a task and to share information at any time. Such platforms also let us test the learning ability of bots. With this trend, we face the problem of evaluating an enormous number of bots quickly and fairly.
Through the collection and analysis of various competitions, we find that the games used in competitions are becoming more complex, and so are the techniques used in the matches. Judging a match is becoming more time-consuming and sometimes yields results affected by randomness. These problems, combined with an increase in the number of participants, mean that organizers need to improve the competition process to produce fair results on time.
An emerging MCTS (Monte Carlo Tree Search) based AI evaluation method is worth our attention. Hopefully, such a method can measure the intelligence level of a bot quantitatively and possibly compare bots created for different games. Beyond the above, measuring a bot's cooperative ability in a multi-agent (three agents or more) system remains an open problem.
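The abstract does not detail how the MCTS-based evaluation works; for reference only, standard MCTS selects actions inside its search tree with the UCT rule shown below, and an evaluation method built on MCTS would presumably reuse such visit and value statistics. The formula is generic background, not the speaker's method.

```latex
% UCT selection in standard MCTS: Q(s,a) is the average return of action a at
% node s, N(s) the visit count of s, N(s,a) the visit count of (s,a), and c an
% exploration constant.
a_{t} = \arg\max_{a} \left[ Q(s, a) + c \sqrt{\frac{\ln N(s)}{N(s, a)}} \right]
```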
About IJTCS
Introduction → The International Joint Conference on Theoretical Computer Science Makes Its Debut
Highlights → Conference Invited Talks (I)
Highlights → Conference Invited Talks (II)
Program → Track: Algorithmic Game Theory
Program → Track: Blockchain Technology
Program → Track: Machine Learning Theory
Program → Track: Quantum Computing
IJTCS Registration
Registration is now officially open to the public! Each participant may register for free to watch the online talks, or pay a registration fee to interact further with the speakers about their talks and take part in more conference activities.
Watching the online talks: free
Full registration:
(Regular) $100 / ¥700
(Student) $50 / ¥350*
Attend all sessions as a participant, ask questions and join discussions online directly, and take part in dedicated interactive sessions
Registration deadline: 23:59, August 15, 2020
Click the QR code below to go to the registration page:
*Student registration: after registering on the website, please photograph the page of your student ID that shows your personal and school information and send it to IJTCS@pku.edu.cn, with the email subject in the format "Student Registration + Name".
Conference Chairs
John Hopcroft
Foreign Member of the Chinese Academy of Sciences; Visiting Chair Professor, Peking University
Huimin Lin
Member of the Chinese Academy of Sciences; Institute of Software, Chinese Academy of Sciences
Conference Co-Chair
Xiaotie Deng
Professor, Peking University
Advisory Committee Chairs
Wen Gao
Member of the Chinese Academy of Engineering; Professor, Peking University
Hong Mei
Member of the Chinese Academy of Sciences; President of CCF
Pingwen Zhang
Member of the Chinese Academy of Sciences; President of CSIAM; Professor, Peking University
Organizers
Register Now
Conference website:
https://econcs.pku.edu.cn/ijtcs2020/IJTCS2020.html
Registration link:
https://econcs.pku.edu.cn/ijtcs2020/Registration.htm
Contact
For conference sponsorship, cooperation, and other inquiries, please contact: IJTCS@pku.edu.cn
— Copyright Notice —
All text, images, and audio/video material on this WeChat public account that was created or collected by the WeChat account of the Center on Frontiers of Computing Studies, Peking University, is copyrighted by the Center's WeChat account; the copyright of text, images, and audio/video material collected from public channels or reposted with authorization belongs to the original authors. If an original author does not wish their content to appear on this account, please notify us promptly and it will be removed.
Click "Read the original" to go to the conference registration page